Fix binary search #3

Open

sleepsort wants to merge 1 commit into master
Conversation

sleepsort (Author):
Basically, we cannot reuse the exit state (quant_A, mid) as the final result: as the following logs show, the current logic will produce a quantized vector with mean_err > target_error.

Since there are only log(N) candidate nodes in the search tree, it would be simpler to just calculate the quantization in a final pass.
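The proposed shape (half-open search, recompute once at the end) can be sketched as a self-contained toy. Here `error(q)` is a hypothetical stand-in for the quantize-and-measure step in the real code, assumed non-increasing in q; this is not the repository's actual function.

```python
def search_quant(error, target_err, low=1, high=128):
    """Return the smallest q in [low, high] with error(q) <= target_err,
    assuming error is non-increasing in q."""
    while low < high:
        mid = low + (high - low) // 2
        if error(mid) > target_err:
            low = mid + 1      # mid is too coarse; the answer lies above it
        else:
            high = mid         # mid is good enough; keep it in the range
    # Final pass: recompute the result from the settled index instead of
    # reusing whatever the loop body last produced.
    return low, error(low)

# Toy monotone error curve: error(q) = 100 / q
q, err = search_quant(lambda q: 100.0 / q, target_err=3.0)
```

With this toy curve the smallest q meeting the target is 34, since 100/34 is just under 3.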

Log before fix:
https://gist.github.com/sleepsort/1d0d1582825c6ea248eca19a0a471d1f

Log after fix:
https://gist.github.com/sleepsort/64ac49af24dc98eefa3ea8c4575ab3b6

To reproduce:

@ot (Owner) left a comment

This was sort of intentional: we just care about being close enough to the target. Because of the invariant described in the inline comments, the algorithm is correct; the "fix" would be to use the "high" quantization rather than "mid". Of course, you would then need to recompute the results with that value.

It is not clear that your fix is correct.
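The minimal change ot describes (keep the existing loop, but take `high` as the answer and recompute with it) can be sketched like this, again with a hypothetical non-increasing `error(q)` standing in for the quantize-and-measure step:

```python
def search_quant_keep_high(error, target_err, low=1, high=128):
    # Precondition (assumed by the caller): error(high) <= target_err < error(low)
    while high - low > 1:
        mid = (high + low) // 2
        if error(mid) > target_err:
            low = mid
        else:
            high = mid
    # Postcondition: high - low == 1 and error(high) <= target_err < error(low),
    # so return high and recompute the quantization with that value.
    return high, error(high)

q, err = search_quant_keep_high(lambda q: 100.0 / q, target_err=3.0)
```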

@@ -64,19 +64,23 @@ def quantize_array_lbda(A, lbda):
 def quantize_array_target(A, target_err):
     low = 1
     high = 128
+    mid = low
ot (Owner):

You don't need this

-    while high - low > 1:
-        mid = (high + low) / 2
+    while low < high:
+        mid = low + (high - low) / 2
ot (Owner):

Python has arbitrary precision integers, no need to do this

         quant_A, dequant_A = quantize_array(A, mid)
         mean_err = np.mean(np.sqrt(np.sum((dequant_A - A)**2, axis=-1)) / A_norms)
         logging.info("Binary search: q=%d, err=%.3f", mid, mean_err)

         if mean_err > target_err:
-            low = mid
+            low = mid + 1
ot (Owner):

This breaks the loop invariant that error(high) <= target_err < error(low)

         else:
             high = mid

+    mid = low
+    quant_A, dequant_A = quantize_array(A, mid)
ot (Owner):

This is copypasta

@sleepsort (Author) commented Oct 2, 2016

Yeah, I understand that the result is always near the error boundary. I just want to make sure the code expresses the intention clearly.

And no, the algorithm is not right: simply returning high doesn't solve the problem. A simple counter-example is an array of the same number, [0.1] * 128; the algorithm will return index 1 instead of index 0, which is surely wrong.

This error comes from using high - low > 1 as the loop condition. We cannot tighten that condition to narrow the range further, because with integer truncation mid = (low + high) / 2 will always equal low in some cases, causing an infinite loop.
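The truncation trap is easy to see in isolation: once the endpoints are adjacent, integer division pins mid to low, so a loop body that assigns low = mid can never make progress.

```python
low, high = 33, 34          # adjacent endpoints
mid = (low + high) // 2     # integer division truncates toward low
assert mid == low           # mid cannot move past low here
# A loop of the form `while low < high: ... low = mid ...` would therefore
# spin forever at this point; assigning `low = mid + 1` restores progress.
```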

I wrote an example test to show that my algorithm is right (I can try to prove it later) and that the current code has problems.

https://gist.github.com/sleepsort/708de5484d8484471ed33c4230790d39

@ot (Owner) commented Oct 2, 2016

> A simple counter-example is an array of the same number, [0.1] * 128; the algorithm will return index 1 instead of index 0, which is surely wrong.

That doesn't satisfy the invariant I mentioned in the comment. What is the invariant of your algorithm?

@sleepsort (Author) commented Oct 2, 2016

> That doesn't satisfy the invariant I mentioned in the comment

Ah, I see; sorry, I didn't understand the term "invariant" previously. I don't know whether real data will produce the same mean_err over some ranges; how about this?

Error values: [4, 3], target = 4

In theory, it is possible that we miss a better compression method by returning index 1 instead of index 0. And this example breaks the invariant as well.
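Spelled out as a quick check with the toy numbers above: with error values [4, 3] and target 4, error(low) is not strictly greater than the target, so the precondition error(high) <= target < error(low) fails, and returning high (index 1) misses the better-compressing index 0.

```python
errors = [4, 3]
target = 4
# Smallest index whose error already meets the target:
best = min(i for i, e in enumerate(errors) if e <= target)
assert best == 0   # index 0 already qualifies, so returning index 1 overshoots
```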

I might need some time to figure out the invariant of my algorithm... My intention is to make sure that error(low) > target_error never happens, unless every i in the array has error(i) > target_error, which already triggers the exit condition. Can that be considered an invariant?

@ot (Owner) commented Oct 2, 2016

That still doesn't satisfy the invariant (in this case, that error(low) > target). I'm not sure you're familiar with standard algorithm correctness proofs; see for example https://en.wikipedia.org/wiki/Loop_invariant.

In the algorithm as it is now, we have:

  • Precondition: error(high) <= target < error(low)
  • Invariant: same as precondition
  • Postcondition: same as precondition, plus high - low = 1

From the postcondition you can derive that low is the largest integer with error > target, and high is the smallest integer with error <= target. So if you want the best compression such that the error is <= target, just use high. Why, then, do you want to change the binary search implementation?

@sleepsort (Author):
Thanks for the link, I'll check it later.

Yes, your algorithm is right if the invariant holds. My point is that we cannot always assert that real data meets this invariant; consider edge cases such as error(i) == error(j), or error(low) == target. The code is vulnerable to those cases, which might happen in production.

And my initial intention was to make this binary search more general.

@ot (Owner) commented Oct 2, 2016

> For example, edge cases such as error(i) == error(j)

This is not an edge case, and the algorithm still works. As for the preconditions, you can check them before running the search and decide what to do if they are not met: in this case it means that the initial range is not wide enough, and we should just stop the algorithm rather than try to return an endpoint of the range.

I prefer using an algorithm that has obvious guarantees. What are the guarantees of your version? Can you prove them?
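Checking the preconditions up front, as suggested, might look like the sketch below; `error(q)` is again a hypothetical non-increasing stand-in for the quantize-and-measure step, not the repository's code.

```python
def search_with_precheck(error, target_err, low=1, high=128):
    if error(low) <= target_err:
        # Left endpoint already meets the target; nothing to search for.
        return low, error(low)
    if error(high) > target_err:
        # Range too narrow: even the finest setting misses the target.
        raise ValueError("precondition error(high) <= target_err violated")
    # Precondition now holds: error(high) <= target_err < error(low)
    while high - low > 1:
        mid = (high + low) // 2
        if error(mid) > target_err:
            low = mid
        else:
            high = mid
    return high, error(high)
```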

@sleepsort (Author) commented Oct 2, 2016

> For example, edge cases such as error(i) == error(j)
>
> This is not an edge case, and the algorithm still works.

Oh, to clarify, I meant that it breaks the precondition rather than the invariant. Since the precondition error(high) <= target < error(low) implies error(high) < error(low), the case error(high) == error(low) is the so-called edge case. Anyway, it can be pre-checked.


And I understand the difference in our preferences :). My preference is an implementation that handles all possible corner cases (my understanding of "general") with less code.

As I understand it, to make your algorithm error-immune the code would have to look like:

// quantize & dequantize on low to make sure `target < error(low)` [1]
// quantize & dequantize on high to make sure `error(high) <= target` [2]
while (...) {
  // quantize & dequantize on mid to find optimal solution [3]
}
// quantize & dequantize on high to get result [4]

And my algorithm can drop [1] and [2], and report the failure of [2] when [4] is calculated, no matter what the initial values of low and high are.


About the invariant in my algorithm, how about this?
Assumption: error(0) = target + 1, error(128) <= target
Precondition: error(low - 1) > target >= error(high)
Invariant: Same as precondition
Postcondition: same as precondition, plus low = high

The first assumption is a dummy, and exception handling for the second assumption can be done after the loop, so the code looks like:

while (...) {
  // quantize & dequantize on mid to find optimal solution [3]
}
// quantize & dequantize on high to get result [4]
if (error(high) > target) {
  // exception handling
}
return high

And to explain the loop in my code:

  • every time after f moves, assert(A[f-1] > target);
  • every time after t moves, assert(A[t] <= target) (inside the loop, assert(m < t), so t = m really moves t);
  • outside the while loop, assert(f == t) and assert(A[t-1] > target). The only case where A[t] <= target does not hold is when t hasn't moved.
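The assertions above can be cross-checked on a toy non-increasing error table, using f/t as in the bullets (a sketch of the described loop, not the PR's actual code):

```python
def half_open_search(A, target):
    """Smallest index t with A[t] <= target, for non-increasing A.
    Returns len(A) - 1 even if no element qualifies; the caller must
    check A[t] afterwards, as in the exception-handling step above."""
    f, t = 0, len(A) - 1
    while f < t:
        m = f + (t - f) // 2
        if A[m] > target:
            f = m + 1          # after f moves: A[f - 1] > target
        else:
            t = m              # after t moves: A[t] <= target
    return t                   # here f == t

A = [9, 7, 5, 4, 4, 2, 1]
t = half_open_search(A, target=4)
assert A[t] <= 4 and (t == 0 or A[t - 1] > 4)
```

Note the caller-side check matters in the no-solution case: for A = [9, 7] and target = 1, t never moves, so A[t] > target must be detected after the loop.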
